69 research outputs found

Computational re-engineering of Amylin sequence with reduced amyloidogenic potential

Author: Jérôme Waldispühl
Mohamed R Smaoui
Publication venue: Springer Nature
Publication date: 01/01/2015

Modeling and predicting all-α transmembrane proteins including helix–helix pairing

Author: Steyaert Jean-Marc
Waldispühl Jérôme
Publication venue: Elsevier B.V.
Publication date

Modeling and predicting the structure of proteins is one of the most important challenges of computational biology. Exact physical models are too complex to provide feasible prediction tools, and other ab initio methods use only local and probabilistic information to fold a given sequence. We show in this paper that all-α transmembrane protein secondary and super-secondary structures can be modeled with a multi-tape S-attributed grammar. An efficient structure prediction algorithm using both local and global constraints is designed and evaluated. Comparison with existing methods shows that the prediction rates as well as the definition level are appreciably increased. Furthermore, this approach can be generalized to more complex proteins.

Elsevier - Publisher Connector

Combining structure probing data on RNA mutants with evolutionary information reveals RNA-binding interfaces

Author: Ponty Yann
Reinharz Vladimir
Waldispühl Jérôme
Publication venue: Oxford University Press (OUP)
Publication date: 01/01/2016

Systematic structure probing experiments (e.g. SHAPE) of RNA mutants, such as the mutate-and-map protocol, give us direct access to the genetic robustness of ncRNA structures. Comparative studies of homologous sequences provide a distinct, yet complementary, approach to analyzing structural and functional properties of non-coding RNAs. In this paper, we introduce a formal framework to combine the biochemical signal collected from mutate-and-map experiments with the evolutionary information available in multiple sequence alignments. We apply neutral theory principles to detect complex long-range dependencies between nucleotides of a single-stranded RNA, and implement these ideas in a software tool called aRNhAck. We illustrate the biological significance of this signal and show that the nucleotide networks calculated with aRNhAck are correlated with nucleotides located in RNA-RNA, RNA-protein, RNA-DNA and RNA-ligand interfaces. aRNhAck is freely available at http://csb.cs.mcgill.ca/arnhack

INRIA a CCSD electronic archive server

PubMed Central

HAL-Polytechnique

Using Structural and Evolutionary Information to Detect and Correct Pyrosequencing Errors in Noncoding RNAs.

Author: Ponty Yann
Reinharz Vladimir
Waldispühl Jérôme
Publication venue: Mary Ann Liebert Inc
Publication date: 30/05/2013

Extended version of RECOMB'13. The analysis of the sequence-structure relationship in RNA molecules is essential not only for evolutionary studies but also for concrete applications such as error correction in next generation sequencing (NGS) technologies. The prohibitive sizes of the mutational and conformational landscapes, combined with the volume of data to process, require efficient algorithms to compute sequence-structure properties. In this article, we address the correction of NGS errors by calculating which mutations most increase the likelihood of a sequence to a given structure and RNA family. We introduce RNApyro, an efficient, linear time and space inside-outside algorithm that computes exact mutational probabilities under secondary structure and evolutionary constraints given as a multiple sequence alignment with a consensus structure. We develop a scoring scheme combining classical stacking base-pair energies with novel isostericity scores and apply our techniques to correct pointwise errors in 5S and 16S rRNA sequences. Our results suggest that RNApyro is a promising algorithm to complement existing tools in the NGS error-correction pipeline.

arXiv.org e-Print Archive

INRIA a CCSD electronic archive server

PubMed Central

HAL-Polytechnique

A linear inside-outside algorithm for correcting sequencing errors in structured RNA sequences

Author: Ponty Yann
Reinharz Vladimir
Waldispühl Jérôme
Publication venue: HAL CCSD
Publication date: 01/01/2013

Analysis of the sequence-structure relationship in RNA molecules is essential to evolutionary studies but also to concrete applications such as error-correction methodologies in sequencing technologies. The prohibitive sizes of the mutational and conformational landscapes, combined with the volume of data to process, require efficient algorithms to compute sequence-structure properties. More specifically, here we aim to calculate which mutations most increase the likelihood of a sequence to a given structure and RNA family. In this paper, we introduce RNApyro, an efficient linear-time and space inside-outside algorithm that computes exact mutational probabilities under secondary structure and evolutionary constraints given as a multiple sequence alignment with a consensus structure. We develop a scoring scheme combining classical stacking base pair energies with novel isostericity scales, and apply our techniques to correct point-wise errors in 5S rRNA sequences. Our results suggest that RNApyro is a promising algorithm to complement existing tools in the NGS error-correction pipeline.

CiteSeerX

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Polytechnique

Flexible RNA design under structure and sequence constraints using formal languages

Author: Denise Alain
Ponty Yann
Vialette Stéphane
Waldispühl Jérôme
Zhang Yi
Zhou Yu
Publication venue
Publication date: 01/08/2013

The problem of RNA secondary structure design (also called inverse folding) is the following: given a target secondary structure, one aims to create a sequence that folds into, or is compatible with, that structure. In several practical applications in biology, additional constraints must be taken into account, such as the presence/absence of regulatory motifs, either at a specific location or anywhere in the sequence. In this study, we investigate the design of RNA sequences from their targeted secondary structure, given these additional sequence constraints. To this end, we develop a general framework based on concepts of language theory, namely context-free grammars and finite automata. We efficiently combine a comprehensive set of constraints into a unifying context-free grammar of moderate size. From there, we use generic algorithms to perform a (weighted) random generation, or an exhaustive enumeration, of candidate sequences. The resulting method, whose complexity scales linearly with the length of the RNA, was implemented as a standalone program. The resulting software was embedded into a publicly available dedicated web server. The applicability of the method was demonstrated on a concrete case study dedicated to Exon Splicing Enhancers, in which our approach was successfully used in the design of in vitro experiments. Comment: ACM BCB 2013 - ACM Conference on Bioinformatics, Computational Biology and Biomedical Informatics (2013)

arXiv.org e-Print Archive

HAL-CentraleSupelec

Crossref

INRIA a CCSD electronic archive server

Hal-Diderot

HAL-Ecole des Ponts ParisTech

HAL-Polytechnique

HAL - UPEC / UPEM

Investigating Mutations to Reduce Huntingtin Aggregation by Increasing Htt-N-Terminal Stability and Weakening Interactions with PolyQ Domain

Author: Cody Mazza-Anthony
Jérôme Waldispühl
Mohamed R. Smaoui
Publication venue: Hindawi Limited
Publication date: 01/01/2016

Huntington’s disease is a fatal autosomal genetic disorder characterized by an expanded glutamine-coding CAG repeat sequence in the huntingtin (Htt) exon 1 gene. The Htt protein associated with the disease misfolds into toxic oligomers and aggregate fibril structures. Competing models for the misfolding and aggregation phenomena have suggested the role of the Htt-N-terminal region and the CAG trinucleotide repeats (polyQ domain) in affecting aggregation propensities and misfolding. In particular, one model suggests a correlation between structural stability and the emergence of toxic oligomers, whereas a second model proposes that molecular interactions with the extended polyQ domain increase aggregation propensity. In this paper, we computationally explore the potential to reduce Htt aggregation by addressing the aggregation causes outlined in both models. We investigate the mutation landscape of the Htt-N-terminal region and explore amino acid residue mutations that affect its structural stability and hydrophobic interactions with the polyQ domain. Out of the millions of 3-point mutation combinations that we explored, (L4K E12K K15E) was the most promising combination that addressed the aggregation causes in both models. The mutant structure exhibited extreme alpha-helical stability, low amyloidogenicity potential, a hydrophobic residue replacement, and removal of a solvent-inaccessible intermolecular side chain that assists oligomerization.

Crossref

Directory of Open Access Journals

SPARCS: a web server to analyze (un)structured regions in coding RNA sequences.

Author: Blanchette Mathieu
Lecuyer Eric
Ponty Yann
Waldispühl Jérôme
Zhang Yang
Publication venue: Oxford University Press (OUP)
Publication date: 08/06/2013

More than a simple carrier of the genetic information, messenger RNA (mRNA) coding regions can also harbor functional elements that evolved to control different post-transcriptional processes, such as mRNA splicing, localization and translation. Functional elements in RNA molecules are often encoded by secondary structure elements. In this article, we introduce Structural Profile Assignment of RNA Coding Sequences (SPARCS), an efficient method to analyze the (secondary) structure profile of protein-coding regions in mRNAs. First, we develop a novel algorithm that enables us to sample uniformly the sequence landscape preserving the dinucleotide frequency and the encoded amino acid sequence of the input mRNA. Then, we use this algorithm to generate a set of artificial sequences that is used to estimate the Z-score of classical structural metrics such as the sum of base pairing probabilities and the base pairing entropy. Finally, we use these metrics to predict structured and unstructured regions in the input mRNA sequence. We applied our methods to study the structural profile of the ASH1 genes and recovered key structural elements. A web server implementing this discovery pipeline is available at http://csb.cs.mcgill.ca/sparcs together with the source code of the sampling algorithm.

INRIA a CCSD electronic archive server

PubMed Central

HAL-Polytechnique

A low-latency, big database system and browser for storage, querying and visualization of 3D genomic data

Author: Blanchette Mathieu
Butyaev Alexander
Cudré-Mauroux Philippe
Mavlyutov Ruslan
Waldispühl Jérôme
Publication venue
Publication date: 02/08/2017

Recent releases of genome three-dimensional (3D) structures have the potential to transform our understanding of genomes. Nonetheless, storage technology and visualization tools need to evolve to offer the scientific community fast and convenient access to these data. We simultaneously introduce a database system to store and query 3D genomic data (3DBG), and a 3D genome browser to visualize and explore 3D genome structures (3DGB). We benchmark 3DBG against state-of-the-art systems and demonstrate that it is faster than previous solutions and, importantly, scales gracefully with the size of the data. We also illustrate the usefulness of our 3D genome Web browser for exploring human genome structures. The 3D genome browser is available at http://3dgb.cs.mcgill.c

RERO DOC Digital Library